Spark 2.1 JSON operations (save/read)
Building configuration info:
case class BuildingConfig(buildingid: String, building_height: Long, gridcount: Long, gis_display_name: String, wear_loss: Double, path_loss: Double) extends Serializable
Write a JSON file to HDFS:
sql( s"""|select buildingid, |height, |gridcount, |collect_list(gis_display_name)[0] as gis_display_name, |avg(wear_loss) as wear_loss, |avg(path_loss) as path_loss |from |xxx |""".stripMargin) .map(s => BuildingConfig(s.getAs[String]("buildingid"), s.getAs[Int]("height"), s.getAs[Long]("gridcount"), s.getAs[String]("gis_display_name"), s.getAs[Double]("wear_loss"), s.getAs[Double]("path_loss"))) .toDF.write.format("org.apache.spark.sql.json").mode(SaveMode.Overwrite).save(s"/user/my/buidlingconfigjson/${p_city}")
Read the JSON file from HDFS:
/**
 * scala> buildingConfig.printSchema
 * root
 *  |-- building_height: long (nullable = true)
 *  |-- buildingid: string (nullable = true)
 *  |-- gis_display_name: string (nullable = true)
 *  |-- gridcount: long (nullable = true)
 *  |-- path_loss: double (nullable = true)
 *  |-- wear_loss: double (nullable = true)
 **/
spark.read.json(s"/user/my/buildingconfigjson/${p_city}")
  .map(s => BuildingConfig(
    s.getAs[String]("buildingid"),
    s.getAs[Long]("building_height"),
    s.getAs[Long]("gridcount"),
    s.getAs[String]("gis_display_name"),
    s.getAs[Double]("wear_loss"),
    s.getAs[Double]("path_loss")))
  .createOrReplaceTempView("building_scene_config")
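Because the JSON field names and types line up exactly with BuildingConfig, the manual getAs mapping can also be replaced by an encoder-based decode. A minimal sketch, assuming a SparkSession named spark with spark.implicits._ in scope:

import spark.implicits._

val buildingConfig = spark.read
  .json(s"/user/my/buildingconfigjson/${p_city}")
  .as[BuildingConfig]  // as[T] matches columns to case-class fields by name

buildingConfig.createOrReplaceTempView("building_scene_config")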
Fundamentals are what programmers should really dig into, for example:
1) How List/Set/Map are implemented internally and how they differ
2) MySQL index storage structure & how to tune it; B-tree characteristics, computational complexity, and the factors that affect that complexity...
3) JVM runtime structure, principles, and tuning
4) How Java class loaders work
5) How the GC process works in Java and the collection algorithms it uses
6) How consistent hashing is implemented in Redis and how it differs from ordinary hashing
7) Java multithreading and thread-pool development/management; the difference between Lock and synchronized (a minimal sketch follows this list)
8) Spring IoC/AOP principles; the loading process...
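On item 7, a minimal sketch of the practical difference (a hypothetical Counter class, not from the original post): synchronized is a block-scoped intrinsic lock, while ReentrantLock is an explicit lock object that adds features such as tryLock and fairness but must be released in a finally block:

import java.util.concurrent.locks.ReentrantLock

class Counter {
  private var n = 0
  private val lock = new ReentrantLock()

  // Intrinsic lock: acquired and released automatically around the block.
  def incSync(): Unit = synchronized { n += 1 }

  // Explicit lock: more flexible (tryLock, fairness), but must be unlocked in finally.
  def incLock(): Unit = {
    lock.lock()
    try n += 1
    finally lock.unlock()
  }

  def value: Int = synchronized { n }
}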